NASA Goddard Radiation Effects and
Analysis Group (REAG) FPGA Testing
Supporters and Collaborators


• Supporters: 

- Defense Threat Reduction Agency (DTRA) 

- NASA Electronics Parts and Packaging (NEPP) Program 

• Collaborators: 

- Xilinx 

- GSFC Satellite Servicing Capabilities Office (SSCO) 
Acronyms


Block random access memory (BRAM) 
Built-in-self-test (BIST) 

Combinatorial logic (CL) 

Configurable Logic Block (CLB) 

Device under test (DUT) 

Digital Clock Manager (DCM) 

Digital Signal Processing Block (DSP) 

Distributed triple modular redundancy 
(DTMR) 

Dual interlocked storage cell (DICE) 
Edge-triggered flip-flops (DFFs) 

Field programmable gate array (FPGA) 

Global triple modular redundancy 
(GTMR) 

Input - output (I/O) 

Linear energy transfer (LET) 

Local triple modular redundancy (LTMR) 
Look up table (LUT) 

Low cost digital tester (LCDT) 

Mitigated DCM (MITDCM) 


Power on reset (POR) 

Probability of logic masking (P_logic)

Radiation Effects and Analysis Group 
(REAG) 

Single-event effects Immune 
Reconfigurable FPGA (SIRF) 

Single event functional interrupt (SEFI) 
Single event latchup (SEL) 

Single event transient (SET) 

Single event upset (SEU) 

Single event upset cross-section (σ_SEU)
Static random access memory (SRAM) 
System on a chip (SOC) 

Universal Serial Bus (USB) 

Virtex-5QV (V5QV) 

Windowed Shift Register (WSR) 


To be presented by Melanie Berg at the Electronic Technology Workshop (ETW), Greenbelt, MD, June 18th, 2014


Virtex-5QV Investigation Overview 

* This is an independent study to determine the single 
event destructive and transient susceptibility of the 
Xilinx Virtex-5QV (SIRF) device. 

* The DUT is configured to have various test structures 
that are geared to measure specific potential 
susceptibilities of the device. 

* Design/Device susceptibility is determined by 
monitoring DUTs for Single Event Transient (SET) and 
Single Event Upset (SEU) induced faults while exposing 
them to a heavy ion beam. 

* Test strategies are based on the NASA Goddard REAG 
FPGA SEU Test guidelines manual : 

https://nepp.nasa.gov/files/23779/fpga_radiation_test_guidelines_2012.pdf 
Characterizing SEUs: Radiation Testing
and SEU Cross Sections

SEU cross sections (σ_SEU) characterize how many
upsets will occur based on the number of ionizing
particles to which the device is exposed.

$$\sigma_{SEU} = \frac{\#\,\text{errors}}{\text{fluence}}$$

Terminology:

• Flux: Particles/(sec-cm²)

• Fluence: Particles/cm²

σ_SEU is calculated at several LET values
(particle spectrum).

Testing with a low flux is imperative
with the Xilinx V5QV due to the
complexity of the device versus the
accelerated rate of exposure.
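As a minimal sketch of the cross-section definition above, the following computes σ_SEU from an error count, a flux, and an exposure time. The function name and the numeric values are illustrative assumptions, not data from this study.

```python
def seu_cross_section(num_errors: int, flux: float, exposure_s: float) -> float:
    """Return sigma_SEU in cm^2: observed errors divided by fluence.

    flux is in particles/(sec-cm^2); exposure_s is the beam time in seconds.
    """
    fluence = flux * exposure_s  # particles/cm^2
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return num_errors / fluence


# Hypothetical run: 50 errors at 1e4 particles/(sec-cm^2) for 100 seconds.
sigma = seu_cross_section(num_errors=50, flux=1.0e4, exposure_s=100.0)
print(f"{sigma:.2e} cm^2")
```

Repeating the calculation at each LET value gives the points of a σ_SEU versus LET curve.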
Understanding SEU Data and Their
Applications to Complex Designs

Along with providing σ_SEU data, aspects of how
the data were obtained are discussed, such as:

- Related test structure(s), 

- Speed of operation, and 

- Reasoning of test strategy. 

A goal of SEU radiation testing is to eventually 
extrapolate SEU data to critical applications 
(designs). 

Designs are complex. Without an 
understanding of how and why data are 
obtained, extrapolation will be inaccurate and 
can be detrimental to the success of a mission. 
FPGA Structure Categorization as
defined by NASA Goddard REAG

σ_SEU differentiation:

$$P(f_s)_{error} \propto P_{configuration} + P(f_s)_{functionalLogic} + P_{SEFI}$$

- Design σ_SEU: P(f_s)_error
- Configuration σ_SEU: P_configuration
- Functional logic σ_SEU: P(f_s)_functionalLogic; sequential and combinatorial logic (CL) in data path
- SEFI σ_SEU: P_SEFI; global routes and hidden logic

SEU testing is required in order to characterize the
σ_SEU for each of the FPGA categories.
V5QV Accelerated SEU Testing

Best Practice for Radiation Test
Setups: Functional Control 

Types of DUT functional input control: clocks, 

resets, and data inputs. 

Concerns: 

- Synchronizing inputs and managing skew between
inputs. This is challenging at high frequencies.

- Operating the device in a realistic manner: 

• Do not over-load the device with unrealistic stimulus during 
radiation testing. If the device is operating in states that 
would never occur, then radiation data will not be 
characteristic. 

• Do not under-load the device during radiation testing. If the 
device is underperforming, this means that a large amount 
of circuitry is not operating. This produces operational 
states with a large amount of logic masking; consequently, 
radiation data will not be characteristic. 
Best Practice for Radiation Test
Setups: Power Control 



Types of voltage controllers: power supplies 
and special on-board voltage regulation 
circuitry. 

Concerns: 

- Device may draw a larger amount of current than 
originally expected. Cooling apparatus may be 
necessary during operation. 

- Power glitching or Single Event Latch-up (SEL) can 
cause the system to cease operation or be damaged. 
Hence it is best practice to separate test vehicle 
power from DUT power. It is also ideal to have 
current limiting circuitry for the test vehicle and the 
DUT. 
Best Practice for Radiation Test
Setups: Monitoring Functional Upsets 

Compare DUT outputs to expected values. This 
can be done: 

- Visually (only recommended as a supplement); i.e., 
watching the error indication on the error detection 
equipment (e.g., logic analyzer or oscilloscope); 

- Using equipment event triggers; or 

- Custom comparison circuitry. 

Differentiate upset types: e.g., clock tree SET, DFF 
SEU, CL captured SET, or configuration faults. 

Count SEUs (upset statistics): After the upsets 
have been detected and differentiated, they need 
to be counted. The higher the number of upsets, 
the better the statistics. 
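The link between upset count and statistical quality can be sketched as follows, under the assumption (not stated on the slide) that upset counts follow Poisson counting statistics, so the 1-sigma relative uncertainty of N counted upsets is 1/sqrt(N). The function name is illustrative.

```python
import math


def relative_uncertainty(num_upsets: int) -> float:
    """1-sigma relative uncertainty of a Poisson count: 1/sqrt(N)."""
    if num_upsets <= 0:
        raise ValueError("need at least one counted upset")
    return 1.0 / math.sqrt(num_upsets)


# More counted upsets give a tighter estimate of the cross section.
for n in (4, 100, 10_000):
    print(f"{n:>6} upsets -> +/- {100 * relative_uncertainty(n):.0f}%")
```

Under this assumption, going from 4 to 100 counted upsets tightens the estimate from roughly 50% to 10%.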
Best Practice for Radiation Test Setups:
Automated Data Capture and Messaging 

• Reliably capture data: 

- Follow synchronous design rules - which include 
how to capture asynchronous signals. 

- Determine minimal sampling frequency (when 
applicable). 

- Understand the limitations of the automated test 
equipment with respect to the DUT (e.g., memory 
(storage) space and speed). 

• Once erroneous data are captured, they should
be packaged and stored (e.g., sent to a host
PC).

- Timestamp 

- Expected value 

- Received value 
Example of a V5QV Error Record

[Figure: error record layout. Bit positions as labeled in the figure: STATUS FLAGS 183:181; TIME STAMP 171:136; ERROR FLAG 50; DATA PATTERN 49:48; PREVIOUS DATA VALUE 47:24; CURRENT DATA VALUE 23:0.]

Field | # of Bits | Description
Current Value | 24 | Current captured data (cycle N)
Previous Value | 24 | Previous captured data (cycle N-1)
Data Pattern | 2 | Unused: data pattern is always checkerboard
Error Flag | 1 | Unused
Time Stamp | 32 | Cycle counter. Must multiply by the DUT clock period (i.e., divide by the DUT frequency) to convert to time. Used to determine error burst sequences
Status | 3 | Indicates type of error record (see below)

Status values:

"001" Timeout: one of the shift clocks not detected
"011" Out of timeout: all shift clocks are recovered
"000" Error or non-error: current value does not equal previous value
"010" Debug check: command was sent to check value settings
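A minimal decoding sketch for the record described above. It only extracts the fields whose positions are unambiguous in the figure (bits 50:0), maps the four status codes, and converts a cycle count to seconds by dividing by the DUT clock frequency. The function names and the example record are hypothetical; the actual tester software is not shown in this presentation.

```python
# Status codes as listed in the error record description.
STATUS_CODES = {
    0b001: "timeout",
    0b011: "out of timeout",
    0b000: "error or non-error",
    0b010: "debug check",
}


def decode_low_fields(record: int) -> dict:
    """Extract the fields that occupy bits 50:0 of an error record."""
    return {
        "current_value": record & 0xFFFFFF,           # bits 23:0
        "previous_value": (record >> 24) & 0xFFFFFF,  # bits 47:24
        "data_pattern": (record >> 48) & 0x3,         # bits 49:48
        "error_flag": (record >> 50) & 0x1,           # bit 50
    }


def cycles_to_seconds(cycle_count: int, dut_frequency_hz: float) -> float:
    """Convert a time stamp (cycle counter) to seconds."""
    return cycle_count / dut_frequency_hz


# Hypothetical record: checkerboard data in both captured values.
record = (0xAAAAAA << 24) | 0x555555
fields = decode_low_fields(record)
print(hex(fields["previous_value"]), hex(fields["current_value"]))
print(cycles_to_seconds(50_000_000, 50e6), "s")
```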
Best Practice for Radiation Test
Setups: Monitoring Power

Use of power supply monitors or specialized
on-board (tester) circuitry.

Use of an automated monitor/capture system is 
beneficial. Provides the ability to perform post 
processing on power data and to identify 
particular error signatures. 

As previously mentioned, the ability to 
automatically power down or limit current if the 
DUT current gets too high is beneficial. 

For accelerated V5QV SEU testing, we used
all of the above.
Test Structure Configuration
Mitigation: Scrubbing Specifics

Scrubbing is the act of writing into FPGA
configuration memory while the device's functional logic
area is operating, with the intent of correcting
configuration memory bit errors.

Too many upsets in the system (due to accelerated flux)
can cause unrealistic behavior and, in turn, unrealistic σ_SEUs.

The accelerated upset rate can be managed by varying flux.

Make sure scrubbing can keep up with your upset rate.

During irradiation, our scrub rate for the Xilinx V5QV is
once every 100 ms (10 Hz).

Read-back after a test with scrubbing should have a
minimal number of configuration-bit upsets (excluding
un-scrubbable bits).

FPGA: field programmable gate array; σ_SEU: SEU cross-section
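One way to check whether scrubbing keeps up with the beam is to estimate the expected number of configuration upsets that accumulate between scrub passes. The sketch below assumes upsets arrive at a rate of flux × per-bit cross section × number of bits; the bit count and cross section used in the example are placeholders, not V5QV measurements.

```python
def expected_upsets_per_scrub(flux: float, sigma_bit: float,
                              num_bits: float, scrub_period_s: float) -> float:
    """Expected configuration upsets accumulated during one scrub period.

    flux in particles/(sec-cm^2), sigma_bit in cm^2/bit.
    """
    return flux * sigma_bit * num_bits * scrub_period_s


def max_flux_for_target(target_upsets: float, sigma_bit: float,
                        num_bits: float, scrub_period_s: float) -> float:
    """Largest flux that keeps the expected upsets per scrub at or below target."""
    return target_upsets / (sigma_bit * num_bits * scrub_period_s)


# Placeholder values: 4e7 bits, 1e-10 cm^2/bit, 100 ms scrub period.
print(expected_upsets_per_scrub(1.0e3, 1.0e-10, 4.0e7, 0.1))
print(max_flux_for_target(0.1, 1.0e-10, 4.0e7, 0.1))
```

If the expected value approaches or exceeds one upset per scrub period, lowering the flux is the lever the slide describes.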
V5QV Test Set-up

[Figure: V5QV test set-up block diagram. Labels: RS232 (sends commands and receives test info); USB (sends scrubber bit file to LCDT SRAM); RX232; TX232; general tester hardware with USB and RS232 controller; Low Cost Digital Tester (data processing); DUTCLOCK; DUTRESET; Pattern Select; SelectMap Interface; DUT; DUT Outputs.]
Configuration and BRAM Testing

Procedures for Configuration and
BRAM Testing

Basic Configuration and BRAM Static Test:

- Load FPGA configuration;

- Irradiate device while the device is in a static
state (no scrubbing of configuration memory);

- Stop radiation beam and read back the
configuration;

- Count configuration and BRAM upsets; and

- Normalize the upsets by the number of particles
of exposure (Configuration and BRAM SEU cross
section, σ_SEU).

All tests (regardless of type) include configuration
read-back after each beam-run.
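The counting and normalization steps of the static test can be sketched as a bitwise comparison between the loaded configuration and the read-back data. This is a simplified illustration: it assumes both images are available as equal-length byte strings and ignores any bits that a real read-back flow would mask out. The names are hypothetical.

```python
def count_bit_upsets(golden: bytes, readback: bytes) -> int:
    """Count bits that differ between the loaded image and the read-back image."""
    if len(golden) != len(readback):
        raise ValueError("images must be the same length")
    return sum(bin(g ^ r).count("1") for g, r in zip(golden, readback))


def per_bit_cross_section(upsets: int, fluence: float, num_bits: int) -> float:
    """sigma_SEU in cm^2/bit: upsets normalized by fluence and bit count."""
    return upsets / (fluence * num_bits)


# Tiny illustrative images: two bits differ after exposure.
golden = bytes([0x00, 0xFF, 0xAA])
readback = bytes([0x01, 0xFF, 0xA8])
upsets = count_bit_upsets(golden, readback)
print(upsets, per_bit_cross_section(upsets, fluence=1.0e6, num_bits=8 * len(golden)))
```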
Commercial versus Hardened
Configuration Memory Heavy-ion SEU
Data

[Figure: σ_SEUs of Virtex-5 configuration bits, commercial (V5) versus hardened (V5QV). Y axis: σ_SEU (cm²/bit).]
Commercial versus Hardened BRAM
Heavy-ion SEU Data

[Figure: σ_SEUs of Virtex-5 BRAM bits, commercial (V5) versus hardened (V5QV). Y axis: σ_SEU (cm²/bit).]
Investigating SEFIs

We look for particular error signatures to determine
SEFI occurrence:

Read-back of configuration is mostly logic '0' -
assume a Power On Reset (POR) glitch.

Unable to connect to the device to read-back - 
assume problem in the configuration interface. 

- Hidden (to user) state machines 

- Configuration registers 

Global upsets in functional logic - not performed 
during static readback. 

- Reset correction: clock tree or reset tree (global routing) 

- Configuration correction: configuration bit upset - not 
considered a SEFI 
V5QV Configuration SEFIs

[Figure: configuration SEFI σ_SEU (cm²/device) versus LET (MeV-cm²/mg). Y axis ticks from 1.0E-08 to 1.0E-05; X axis from 0 to 70.]

Most SEFI error signatures were large areas of the
configuration bits forced to '0'. This resembles a power on
reset (POR) hit.
Xilinx V5QV Heavy Ion Accelerated
Testing:

Functional Data Path (dynamic
operation) and Functional SEFIs (i.e.,
global routes)
Xilinx V5QV Heavy Ion Accelerated
Testing:

Test Structure Development



-We start with simple test structures. 

-We increase complexity per test structure. 

-We study trends. 

-We try to make sense out of the convoluted data 
obtained from complex test structures. 

Test structure considerations are taken from the NASA Goddard
REAG FPGA SEU Test Guidelines:
https://nepp.nasa.gov/files/23779/fpga_radiation_test_guidelines_2012.pdf

V5 is a commercial Xilinx field programmable gate array; V5QV is a radiation-tolerant device.
Best Practice for Radiation Testing:
Logic Replication for Statistics

Best Practice for DUT Test
Structure Development

Test structures should contain a
large amount of replicated logic
in order to increase statistics.

Examples:

- SEU testing with hundreds of
counters versus only one

- 100s or 1000s of
DFF stages in a
shift register
Best Practice for Radiation Testing:
State Space Traversal

Best Practice for DUT
Test Structure
Development

A test structure's state space
should be traversable such
that it can be covered within
one radiation test run.

Otherwise:

• A significant amount of
circuitry and system states
are not tested.

• The result is SEU data that
are uncharacteristic of the
design.
Best Practice for Radiation Testing:
Logic Masking

Best Practice for DUT Test Structure Development

Logic masking should be minimized or controllable (i.e.,
taken into account).

Any logic gate with more than one input will have logic
masking except for XOR or XNOR gates.

P_logic is the probability that an upset will not
be masked from being captured by the
system.

P_logic = 0: path is 100% masked
P_logic = 1: path has no masking
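To illustrate why multi-input gates other than XOR and XNOR mask upsets, the sketch below enumerates every input state of a gate and counts how often flipping one input changes the output. It assumes all input states are equally likely, which is a simplification; the resulting fraction is used here as a stand-in for P_logic (1 means no masking). The helper name is hypothetical.

```python
from itertools import product


def p_logic(gate, num_inputs: int, flipped_input: int = 0) -> float:
    """Fraction of input states in which flipping one input changes the output."""
    states = list(product((0, 1), repeat=num_inputs))
    observable = 0
    for bits in states:
        upset = list(bits)
        upset[flipped_input] ^= 1  # model an upset on one input
        if gate(*bits) != gate(*upset):
            observable += 1
    return observable / len(states)


def and2(a, b):
    return a & b


def xor2(a, b):
    return a ^ b


print("AND2:", p_logic(and2, 2))  # upset is masked whenever the other input is 0
print("XOR2:", p_logic(xor2, 2))  # upset always propagates
```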
Best Practice for Radiation Testing:
Avoiding Unrealistic SEU Accumulation

Best-practice characteristics of a DUT design

Avoid unrealistic SEU
accumulation from accelerated
testing:

• Use flush-through test
structures; e.g., shift registers.

• Use a small number of gates per
sub-test structure; e.g.,
testing hundreds of
counters.

SRAM-based FPGAs:
scrubbing (correcting)
configuration SEUs is
extremely important during
accelerated testing. Scrubbing must
keep up with the particle flux
to avoid accumulation.
Best Practice for Radiation Testing:
Increasing Visibility



All (or a significant percentage of) potential upsets 
should be observable during testing. 

Test structures can be designed to enhance observable 
nodes; e.g., shift-registers, counters, scan rings, internal 
logic taps. 



If an SEU occurs, 
will it propagate to 
I/O before the test 
is complete? 
Difference between Test Structure and
Application Specific Design


A test structure is a design implemented in a 
DUT that is created specifically for SEU testing. 

An application-specific design is circuitry 
implemented in a DUT that is either the final 
design targeted for space or a subset of the 
final design. 
Use of Test Structures versus
Application Specific Designs for
Acquiring σ_SEU Data

Although error rates and error responses are
design dependent, useful information can be
extrapolated from test structures versus
application specific designs.

Why use test structures?

- By the time the final design is complete, it is usually
too late to perform radiation testing on it.

- It can be too difficult to apply input-stimuli to an
application specific design.

- It can be too difficult to monitor DUT responses of
application specific designs.

Test structures can be constructed to meet SEU-testing
best practice guidelines.
Additional Challenges using Application
Specific Designs for SEU Testing

Statistics are poor, usually because there is not 
a significant amount of replication. 

In addition, trends for specific elements are not 
able to be clearly identified/established. 

The state space of a complex design cannot be 
traversed within one radiation test run. 

Application-specific designs contain a 
significantly higher number of masked data 
paths than test structures. 

It is difficult to control SEU accumulation in an 
accelerated test environment. 

Many best practice considerations are violated. 





Benefits of Testing Application
Specific Designs

Increased observation of error responses specific
to the application.

However, the user must be aware of the
following:

- Unrealistic SEU accumulation in an accelerated
environment,

- Limited visibility due to masking and fractional state
space traversal,

- Poor statistics due to the variance in design circuits,
and

- σ_SEUs will most likely have a large variance if circuits
are not able to be isolated and controlled.
Test Structures used for Dynamic
V5QV SEU Testing

Test Structure | Frequency Range | Additional Fault Tolerance
Shift registers | 2 kHz - 300 MHz | Yes
Counters | 2 kHz - 150 MHz | Yes
Global routes | 2 kHz - 150 MHz | Yes
MicroBlaze™ | 50 MHz | Yes
Digital Signal Processors (DSP blocks) | 2 kHz - 150 MHz | No
Test Structures: Shift Registers

* Shift registers are great for baselining σ_SEUs.

* Simple architecture with no masking.

* Large numbers of stages are easily implemented to
achieve good statistics.

Caveats to traditional shift register SEU testing | NASA Goddard REAG's solution
High speed data input synchronization | Internal data generation
High speed data output capture | Windowed shift registers (WSRs)
Use of built-in-self-test (BIST) counter for SEUs | With the use of WSRs, no need for a BIST counter
Test Structures: Windowed Shift
Registers (WSRs)

[Figure: WSR schematic. A shift register chain of DFFs with N levels of inverters (combinatorial logic) between DFF stages, N = 0, 8, and 18, and a 4-bit window output. T_dly: path delay from DFF to DFF. DFF = D flip flop.]
Test Structures: Inserting Mitigation

* V5QV embedded SEU Filter option: not available in the
commercial FPGA device (it is V5QV specific).

* LTMR: user implemented. Do not use in the Virtex
commercial family of devices, where it is useless. However, it
might be an option in the V5QV (see data section).

* DTMR: user implemented. 

- Implemented with and without area constraints 

- Can be used in the commercial device 

* GTMR: user implemented. 

- Implemented with and without area constraints 

- Can be used in the commercial device 

* Configuration memory scrubbing: user implemented and 
can be used in the commercial and the V5QV devices. 
Test Structures: V5QV Embedded
Single Event Transient (SET) Filters

The V5QV has embedded SET filters placed on
the data input and clock input of each DFF.
Usage is optional.

Filters are expected to reduce the effects and
the capture of SETs.

Xilinx reports that the SET filters reduce
susceptibility.

NASA Goddard REAG has verified this claim.

SET: single event transient; DFF: D flip-flop;
REAG: Radiation Effects and Analysis Group, part of Code 561 at NASA/GSFC

[Figure: DFF schematic.]
Test Structures: Local Triple Modular
Redundancy (LTMR)

[Figure: LTMR schematic showing combinatorial logic blocks and DFFs.]

LTMR:

- Only the DFFs are triplicated and mitigated

- Masks upsets from DFFs

- Corrects DFF upsets if feedback is used

DFF = D flip flop

LTMR is a mitigation strategy that can only be used in
FPGAs with hardened configuration. It cannot be used in
the commercial Virtex family of devices.
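The masking behavior of triplicated DFFs can be sketched with a bitwise majority voter. This is a generic software illustration of TMR voting, not the circuit used in the test structures; the function name is hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote across three redundant register copies."""
    return (a & b) | (a & c) | (b & c)


# Three copies of a 4-bit register; one copy takes a single bit upset.
good = 0b1010
upset_copy = good ^ 0b0100
print(bin(majority(good, upset_copy, good)))  # the upset is outvoted

# With feedback, the voted value is written back to all three copies,
# which corrects the upset copy instead of only masking it.
copies = [majority(good, upset_copy, good)] * 3
```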
Test Structures: Distributed Triple
Modular Redundancy (DTMR): DFFs +
Data Paths

All DFFs with Feedback Have Voters

DFF = D flip flop

[Figure: DTMR schematic. Annotations on the error terms P(f_s)_error, P_configuration, P(f_s)_functionalLogic, and P_SEFI read "Low" and "Minimally Lowered".]
Test Structures: Global Triple Modular
Redundancy (GTMR): DFFs + Data Paths +
Global Routes

All DFFs with Feedback Have Voters

DFF = D flip flop
σ_SEU Data: Investigating Frequency
Effects with WSRs at 5.7 MeV-cm²/mg

[Figure: WSR strings, Ar at 0°, 5.7 MeV-cm²/mg. σ_SEU versus speed (MHz); Y axis ticks from 2.10E-09 to 1.41E-08. Series: WSR0 LTMR Filter, WSR0 No TMR, WSR0 No TMR Filter, WSR8 LTMR Filter, WSR8 No TMR, WSR8 No TMR Filter, WSR16 LTMR Filter, WSR16 No TMR, WSR16 No TMR Filter.]
σ_SEU Data: Investigating Frequency
Effects with WSRs at 20.6 MeV-cm²/mg

[Figure: WSR strings, Kr at 0°, 20.6 MeV-cm²/mg. Cross section (cm²) versus speed (MHz). Series: WSR0 LTMR Filter, WSR0 No TMR, WSR0 No TMR Filter, WSR8 LTMR Filter, WSR8 No TMR, WSR8 No TMR Filter, WSR16 LTMR Filter, WSR16 No TMR, WSR16 No TMR Filter.]
WSR with respect to LET

[Figure: WSR σ_SEU (cm²) with respect to LET.]

WSR8 with respect to LET

[Figure: WSR8 σ_SEU (cm²/bit) with respect to LET. Series: WSR8 10 kHz.]

WSR16 with respect to LET

[Figure: WSR16 σ_SEU (cm²/bit) with respect to LET. Series: WSR16 10 kHz, WSR16 10 MHz.]
WSR SEU Testing: Conclusions

• WSR test structures were used to analyze DFFs, SET
filters, frequency effects, and the efficacy of various
mitigation strategies.

• SEU data illustrate the following:

- Utilization of SET filters provides approximately a decade of
improvement in DFF SEU susceptibility when not using DCMs.

- Frequency effects show that DFF SEUs dominate SETs in the
functional data path. Hence, the embedded DICE mitigation
strategy for the DFFs is not as strong as embedded LTMR.

- Implementing LTMR with filtering does not produce benefits
over using the filter option without LTMR.

- Implementing DTMR does decrease overall σ_SEUs, but at an
expensive price in area, power, and timing.
Test Structures and
Heavy Ion SEU Results:
No DCM

Scrubber always turned on
Lowest LET Tested = 1.8 MeV-cm²/mg

Test Structure | Frequency Range | Additional Fault Tolerance
Counters | 50 MHz | No
Global routes | 50 MHz | No
DCM | 50 MHz | Yes






Counter Test Structures

[Figure: counter array test structure. 120 counters (Counter 0 to Counter 119) are simultaneously shifted into a register bank (snap-shot array) once every 480 (= 4 × 120) clock cycles. Once every 4 clock cycles, the top-most value (24 bits) is output to the low cost digital tester for counter processing, and the registers shift up: Cell(n-1) <= Cell(n).]

In order to study global structures, various clocking schemes are
connected to all of the counters (and snap-shot array) via a clock tree:
input clock (no DCM) versus DCM.
Why Counter Arrays versus a String of
Counters?

Counter Array:

- Upsets are isolated per counter
unless the upset is from a global
route.

- A custom tester can resynchronize
with a counter that incurs an SEU.
Built-in-self-test (BIST) is not
necessary.

- Full state space traversal.

String of Counters:

- Counters are co-dependent; hence it
can be difficult to differentiate
between a multiple bit upset, a single
bit upset, or a global route upset.

- Implementing a string of counters is
complex arithmetic (counters accumulate).
It can be difficult to resynchronize after
an error; consequently, BIST is usually
necessary.

- Usually implemented with simple data
patterns due to the complexity;
hence state space traversal is
extremely limited.

[Figure: string of counters. Counters accumulate, and the final value is output to the tester.]
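The tester-side resynchronization mentioned for counter arrays can be sketched as follows. The sketch assumes each counter is 24 bits wide and increments once per clock cycle, so consecutive snapshots of the same counter differ by 480 (the snapshot period in clock cycles). The class name and the widths are assumptions for illustration; the actual tester logic is not shown in this presentation.

```python
COUNTER_BITS = 24        # assumed counter width
SNAPSHOT_PERIOD = 480    # clock cycles between snapshots (4 * 120)
MASK = (1 << COUNTER_BITS) - 1


class CounterChecker:
    """Tracks one counter and resynchronizes after a mismatch."""

    def __init__(self):
        self.expected = None
        self.errors = 0

    def check(self, observed: int) -> bool:
        """Return True if the snapshot matches the prediction."""
        ok = self.expected is None or observed == self.expected
        if not ok:
            self.errors += 1
        # Resynchronize: predict the next snapshot from the observed value,
        # so one upset is counted once and does not flag every later snapshot.
        self.expected = (observed + SNAPSHOT_PERIOD) & MASK
        return ok


checker = CounterChecker()
snapshots = [0, 480, 960 ^ 0x10, (960 ^ 0x10) + 480]  # third value has a bit flip
results = [checker.check(s) for s in snapshots]
print(results, checker.errors)
```

Because each counter is checked independently, an upset in one counter does not disturb the checking of the others.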
Differentiating SEUs:
Upset types for SEU Analysis

* Global Upsets: σ_SEUs for a sequence of upsets that lasts
longer than a snap shot cycle (> 100s of µs):

- Upsets are from the clock or reset and stem from the top of the
global routing tree.

- Could be clock, DCM, or buffer located high in the global
routing tree.

- Error signature is sporadic and does not resemble a stuck
fault as with a configuration bit SEU induced error.

* Burst: σ_SEUs for a sequence of upsets that occur within
a snap shot cycle (< 100s of µs):

- Upsets are from low in the global routing tree.

* Single Bit: σ_SEUs for DFF (bit) flips in a counter.

* Snap Shot: σ_SEUs for DFF (bit) flips in the snap shot
array.
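A simplified sketch of how an error sequence might be sorted into these categories by its duration relative to the snap shot cycle. It assumes the errors have already been grouped into one related sequence and that time stamps are in microseconds; it does not distinguish single bit from snap shot upsets, which depends on where the flipped DFF sits. The function name and threshold value are hypothetical.

```python
def classify_event(error_times_us, snapshot_cycle_us: float) -> str:
    """Classify one group of related errors by how long the sequence lasts."""
    if not error_times_us:
        raise ValueError("need at least one error time stamp")
    duration = max(error_times_us) - min(error_times_us)
    if duration > snapshot_cycle_us:
        return "global"   # sequence outlasts a snap shot cycle
    if len(error_times_us) > 1:
        return "burst"    # several upsets inside one snap shot cycle
    return "single"       # isolated DFF (bit) flip


# Hypothetical snap shot cycle of 100 microseconds.
print(classify_event([0.0, 250.0], 100.0))
print(classify_event([10.0, 20.0, 30.0], 100.0))
print(classify_event([5.0], 100.0))
```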
Comparison of Various Component
σ_SEUs with SET Filter Off

[Figure: no DCM, SET filter off. SEU cross section (cm²) versus effective LET (MeV-cm²/mg). Series include Counter/Bit and Burst/Device.]

Upsets start to converge at higher LETs. However, DFF upsets
are dominant.

Comparison of Various Component
σ_SEUs with SET Filter On

[Figure: SET filter on. SEU cross section (cm²) versus effective LET (MeV-cm²/mg), 0 to 90.]

Upsets start to converge at higher LETs. DFF upsets are less
dominant than with the SET Filter off.
Counter σ_SEUs: SET Filter Off versus
SET Filter On

[Figure: no DCM, SET filter off versus SET filter on, full LET range. SEU cross section (cm²/bit).]

SET Filter On only makes a difference at low LET values.

Counter σ_SEUs: SET Filter Off versus SET
Filter On: Zooming in on Low LET Values

[Figure: SEU cross section (cm²/bit) versus effective LET (MeV-cm²/mg), 0 to 22.]

SET Filter On decreases σ_SEUs by approximately 1.5 decades. SET
Filter On also increases on-set LET.
σ_SEUs in Counter versus Snapshot
Register

[Figure: SEU cross section (cm²/bit) versus effective LET (MeV-cm²/mg), 0 to 100.]

Counters have a higher cross section than Snap Shot.
Counters are active every cycle; Snap Shot is only active
every 4 cycles. Counters have more complex circuitry than
Snap Shot.
Comparing Global σ_SEUs: SET Filter Off
versus SET Filter On

[Figure: SEU cross section (cm²/device) versus effective LET (MeV-cm²/mg). Series: Global SET Filter Off, Global SET Filter On.]

SET Filter On increases the on-set LET for Global σ_SEUs.

Global versus Burst σ_SEUs

[Figure: no DCM, global versus burst SEUs. SEU cross section (cm²/device) versus effective LET (MeV-cm²/mg).]

Bursts are prevalent at Low LETs with SET Filter On or
Off. However, Bursts slightly decrease at Low LET values
with the SET Filter On.
Test Structures and
Heavy Ion SEU Results:
With DCM

Scrubber always turned on
Lowest LET Tested = 5.7 MeV-cm²/mg

Test Structure | Frequency Range | Additional Fault Tolerance
Counters | 50 MHz | No
Global routes | 50 MHz | No

Additional data on counters will be provided in the
test report and future papers.
Functional Logic Radiation Test
Structures: Digital Clock Manager (DCM)

[Figure: DCM block with outputs CLK90, CLK180, CLK270, CLK2X, CLK2X180, CLKDV, CLKFX, CLKFX180, LOCKED, STATUS[7:0], and PSDONE, connected through the clock distribution delay to the counter structures.]

We are testing DCM susceptibility by connecting the
block to a design with a state space with feedback.
σ_SEUs for DCM Utilization versus No DCM
Utilization with SET Filter On

[Figure: DCM, SET filter on. SEU cross section (cm²).]

Mitigated DCM (MITDCM) versus DCM

[Figure: SEU cross section (cm²/device). Series: MITDCM SET Filter On, DCM SET Filter On.]
Counter SEU Testing: Conclusions

• Counter-array test structures were used to analyze DFF,
SET filter, global route, and DCM SEU susceptibility.

• SEU data illustrate the following:

- Utilization of SET filters provides approximately a decade of
improvement in DFF SEU susceptibility when not using DCMs.

- Utilization of SET filters increases the on-set LET for global routes;
however, it does little for bursts.

- Usage of DCMs significantly increases SEU susceptibility and may
make SET filter utilization impractical.

- The DCM mitigation strategy did not help σ_SEUs and proved to be an
unworthy choice.

• Differentiating σ_SEUs is used to investigate SEU dominance
and can be applied to determining component usage.

• Additional data and how they correlate to the σ_SEUs
illustrated in this presentation will be provided in the final
test report.
Test Structures and
Heavy Ion SEU Results

Scrubber always turned on
Lowest LET Tested = 5.7 MeV-cm²/mg

Test Structure | Frequency Range | Additional Fault Tolerance
DSP48E | 10 MHz - 150 MHz | No
Global routes | 10 MHz - 150 MHz | No

Additional data on counters will be provided in the
test report and future papers.
Virtex 5 Family Digital Signal
Processing Blocks (DSPs): DSP48E

[Figure: DSP48E block diagram. Source: Virtex-5 FPGA XtremeDSP Design Considerations, UG193 (v3.5), January 26, 2012, www.xilinx.com. Surviving figure note: "... internal to the DSP48E column. They are not accessible via fabric routing resources."]

There are a total of 320 DSP48E blocks in the Virtex-5QV (XC5VFX130T).




Test Structures: Strings of DSP48Es
with TMR'd BIST

[Figure: string of DSP48E stages with triplicated BIST. Per-stage inputs:]

A = Constant

B = Registered (delayed) input

C = Input from last stage for accumulation

BIST: Built in Self Test

TMR: Triple Modular Redundancy
Test Structures: String of DSP
Logistics

• DSP48Es are programmed to perform: AB + C.

- A string of DSP48Es accumulates each of the
products to form a polynomial, where N is the number of stages
in the string of DSPs and Y(n) is the n'th output:

$$Y(n) = A_0 B(n) + A_1 B(n-1) + \dots + A_N B(n-N) = \sum_{i=0}^{N} A_i B(n-i)$$

- Strings of DSPs are widely used in Finite Impulse
Response (FIR) filters and image processing.

• Although prior slides suggested not to use
BIST, when dealing with complex circuitry, BIST
is advantageous.

• Note that the BIST compares are triplicated.
Voting is done in the tester. Minimal circuitry.
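The accumulation performed by a string of stages, each computing AB + C, can be sketched in software and checked against the direct sum of products. The sketch assumes inputs before the first sample are zero and uses plain integers; the function names and values are illustrative and do not model DSP48E bit widths or pipelining.

```python
def dsp_string_output(coeffs, samples, n: int) -> int:
    """Y(n) computed stage by stage, each stage performing A*B + C."""
    c = 0  # C input of the first stage
    for i, a in enumerate(coeffs):
        b = samples[n - i] if n - i >= 0 else 0  # delayed input B(n-i)
        c = a * b + c                            # this stage's output feeds the next C
    return c


def direct_sum(coeffs, samples, n: int) -> int:
    """Reference: sum of A_i * B(n-i) over all stages."""
    return sum(a * samples[n - i] for i, a in enumerate(coeffs) if n - i >= 0)


coeffs = [1, 2, 3]       # A_0..A_2 (constants)
samples = [4, 5, 6, 7]   # B(0)..B(3)
print(dsp_string_output(coeffs, samples, 3), direct_sum(coeffs, samples, 3))
```

A BIST comparison in this spirit checks the string output against an independently generated expected value.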
DSP48E σ Data 

[Figure: DSP48E SEU cross sections (cm²/DSP). Legend: eight DSP Bank series operated at 10 MHz, 50 MHz, 100 MHz, and 150 MHz.] 
DSP48E SEU Testing: Conclusions 

* SEU data illustrate the following: 

- Frequency does not seem to affect σSEUs. 

- Configuration upsets have little effect on DSP48Es. 

- σSEUs are fairly low for the amount of processing power. 

* Additional data and how they correlate to the σSEUs 
illustrated in this presentation will be provided in the 
final test report. 
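
For readers unfamiliar with the units, a cross section in cm²/DSP can be computed from the standard definition: observed upsets divided by particle fluence, normalized by the number of DSPs under test. The sketch below assumes that normalization. The function name and the numbers are hypothetical and are not measured data.

```python
def seu_cross_section(num_upsets, fluence, num_units=1):
    """Return upsets / (fluence * units), in cm^2 per unit.

    fluence is in particles/cm^2; num_units is the number of DSPs (or designs)
    exposed, which is an assumed normalization for this illustration.
    """
    if fluence <= 0 or num_units <= 0:
        raise ValueError("fluence and num_units must be positive")
    return num_upsets / (fluence * num_units)


# Hypothetical run: 50 upsets, 1e7 particles/cm^2, 100 DSP48Es in the string.
print(seu_cross_section(50, 1e7, 100))
```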
Test Structures and Heavy Ion SEU 
Scrubber always turned on 
Lowest LET Tested = 5.7 MeV-cm²/mg 

Test Structure    Frequency Range    Additional Fault Tolerance 
MicroBlaze™       50 MHz             No 
Global routes     50 MHz             No 
Caching           50 MHz             Yes 


Additional data on counters will be provided in the 
test report and future papers. 
Processor and SRAM Communication 

Processors talk to memory. 

Most processor radiation tests detect errors through 
erroneous SRAM memory writes, so visibility is 
significantly limited. 

We increase visibility by replacing external SRAM 
with the REAG low-cost digital tester (LCDT). 

[Figure: MicroBlaze™ block diagram (Cache, ALU, SRAM Interface) with data writes going to the LCDT, which uses FPGA RAM.] 

SRAM: Static random access memory 
BRAM: Block random access memory 
More on Increasing Visibility with 
Microprocessor Testing (1) 


As previously stated, the embedded SRAM in 
the tester (BRAM) takes the place of the 
processor's normal external memory. 

In addition, each memory access is time-stamped 
and logged in an alternate bank of BRAM. 
Only the last 512 accesses are kept. 

After each test run, the time-stamped logs are 
output to the user. 
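
The "last 512 accesses" log behaves like a ring buffer. The sketch below models it in software. The class name `AccessLog`, the entry layout, and the example addresses are assumptions for illustration, not the LCDT's actual BRAM format.

```python
from collections import deque

LOG_DEPTH = 512  # only the last 512 accesses are kept


class AccessLog:
    """Ring buffer of time-stamped memory accesses (hypothetical entry layout)."""

    def __init__(self, depth=LOG_DEPTH):
        self.entries = deque(maxlen=depth)  # oldest entries drop off automatically

    def record(self, timestamp, is_write, addr, data):
        self.entries.append((timestamp, "W" if is_write else "R", addr, data))

    def dump(self):
        """Return the retained entries, oldest first, for output after a test run."""
        return list(self.entries)


log = AccessLog()
for cycle in range(600):  # 600 accesses, so the first 88 are overwritten
    log.record(cycle, cycle % 2 == 0, 0x1000 + cycle, cycle)

print(len(log.dump()), log.dump()[0][0])
```

Keeping only the most recent accesses preserves the traffic leading up to a failure without requiring unbounded tester memory.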

[Figure: BRAM access log layout; labels: Read, Timestamp, ADDR, DATA, Write Address.] 
More on Increasing Visibility with 
Microprocessor Testing (2) 

DUT: device under test 

[Figure: the DUT drives the signals listed below to the tester. Watchdogs in the tester monitor them and send watchdog errors to the host PC.] 

- Halted 
- Error 
- Trace Instruction 
- Trace Valid Instruction 
- Trace Exception Taken 
- Trace Exception Kind 
- Trace Register Write 
- Trace Register Address 
- Trace Data cache Request 
- Trace Data cache Hit 
- Trace Data cache Ready 
- Trace Data cache Read 
- Trace Instruction cache Request 
- Trace Instruction cache Hit 
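
One way a tester watchdog might use these signals is sketched below: it reports an error if the DUT halts or if no valid instruction is traced within a timeout. The class name, the timeout rule, and the error strings are assumptions for illustration. The slides do not describe the watchdog logic.

```python
class InstructionWatchdog:
    """Flag a DUT that halts or stops retiring valid instructions (hypothetical rule)."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles
        self.idle_cycles = 0

    def clock(self, valid_instruction, halted=False):
        """Advance one tester clock; return an error string or None."""
        if halted:
            return "halted"
        if valid_instruction:
            self.idle_cycles = 0  # progress observed, so restart the count
            return None
        self.idle_cycles += 1
        if self.idle_cycles >= self.timeout_cycles:
            return "timeout"
        return None


wd = InstructionWatchdog(timeout_cycles=3)
trace_valid = [True, False, False, True, False, False, False]
events = [wd.clock(v) for v in trace_valid]
print(events)
```

In this model the error string stands in for the message a watchdog would send to the host PC.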
MicroBlaze™ SEU Testing: 
Conclusions 

* Visibility was increased by isolating memory accesses 
as follows: 

- Moving the instruction and data storage to the LCDT for traffic 
observation. 

- Performing tests with and without cache to determine the 
influence cache has on upsets. 

* Differentiating global upsets from the normal data set: 

- Helped to understand which upsets are prominent. 

- Gave insight into how the use of cache will affect σSEUs. 

* Monitoring internal MicroBlaze™ signals: 

- σSEUs no longer rely on detecting erroneous memory reads and 
writes. Data are too limited and uninformative with 
sole reliance on memory reads and writes. 

- Can now determine when a processor crashes and how. 





Comparing MicroBlaze™ σSEUs and 
Global Clock σSEUs 

[Figure: SEU cross sections (cm²/design): cache vs. no cache with global routes.] 
Summary 

* We presented a framework for evaluating 
complex digital systems targeted for harsh 
radiation environments, such as space. 



• If performing accelerated SEU testing on an 
application specific design: 


- Understand limitations in testing resultant data; 

- Be prepared for complex data de-convolution; 

- Pay attention to global structures; 

- Use basic-test structures to obtain an underlying 
understanding of DUT SEU behavior; and 

- Maximize visibility - especially when testing 
application-specific designs. 